{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+++\n",
    "title = \"Multiple correspondence analysis\"\n",
    "menu = \"main\"\n",
    "weight = 3\n",
    "toc = true\n",
    "aliases = [\"mca\"]\n",
    "+++"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Resources"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- [Computation of Multiple Correspondence Analysis, with code in R](https://core.ac.uk/download/pdf/6591520.pdf)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data\n",
    "\n",
    "Multiple correspondence analysis is an extension of correspondence analysis. It should be used when you have more than two categorical variables. The idea is to one-hot encode a dataset, before applying correspondence analysis to it.\n",
    "\n",
    "As an example, we're going to use the [balloons dataset](https://archive.ics.uci.edu/ml/machine-learning-databases/balloons/) taken from the [UCI datasets website](https://archive.ics.uci.edu/ml/datasets.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:03.555683Z",
     "iopub.status.busy": "2024-09-07T18:18:03.555314Z",
     "iopub.status.idle": "2024-09-07T18:18:04.552348Z",
     "shell.execute_reply": "2024-09-07T18:18:04.551791Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Color</th>\n",
       "      <th>Size</th>\n",
       "      <th>Action</th>\n",
       "      <th>Age</th>\n",
       "      <th>Inflated</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>YELLOW</td>\n",
       "      <td>SMALL</td>\n",
       "      <td>STRETCH</td>\n",
       "      <td>ADULT</td>\n",
       "      <td>T</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>YELLOW</td>\n",
       "      <td>SMALL</td>\n",
       "      <td>STRETCH</td>\n",
       "      <td>CHILD</td>\n",
       "      <td>F</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>YELLOW</td>\n",
       "      <td>SMALL</td>\n",
       "      <td>DIP</td>\n",
       "      <td>ADULT</td>\n",
       "      <td>F</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>YELLOW</td>\n",
       "      <td>SMALL</td>\n",
       "      <td>DIP</td>\n",
       "      <td>CHILD</td>\n",
       "      <td>F</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>YELLOW</td>\n",
       "      <td>LARGE</td>\n",
       "      <td>STRETCH</td>\n",
       "      <td>ADULT</td>\n",
       "      <td>T</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Color   Size   Action    Age Inflated\n",
       "0  YELLOW  SMALL  STRETCH  ADULT        T\n",
       "1  YELLOW  SMALL  STRETCH  CHILD        F\n",
       "2  YELLOW  SMALL      DIP  ADULT        F\n",
       "3  YELLOW  SMALL      DIP  CHILD        F\n",
       "4  YELLOW  LARGE  STRETCH  ADULT        T"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "dataset = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/balloons/adult+stretch.data')\n",
    "dataset.columns = ['Color', 'Size', 'Action', 'Age', 'Inflated']\n",
    "dataset.head()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fitting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:04.556009Z",
     "iopub.status.busy": "2024-09-07T18:18:04.555668Z",
     "iopub.status.idle": "2024-09-07T18:18:05.004602Z",
     "shell.execute_reply": "2024-09-07T18:18:05.003821Z"
    }
   },
   "outputs": [],
   "source": [
    "import prince\n",
    "\n",
    "mca = prince.MCA(\n",
    "    n_components=3,\n",
    "    n_iter=3,\n",
    "    copy=True,\n",
    "    check_input=True,\n",
    "    engine='sklearn',\n",
    "    random_state=42\n",
    ")\n",
    "mca = mca.fit(dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The way MCA works is that it one-hot encodes the dataset, and then fits a correspondence analysis. In case your dataset is already one-hot encoded, you can specify `one_hot=False` to skip this step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.011574Z",
     "iopub.status.busy": "2024-09-07T18:18:05.011295Z",
     "iopub.status.idle": "2024-09-07T18:18:05.037415Z",
     "shell.execute_reply": "2024-09-07T18:18:05.036730Z"
    }
   },
   "outputs": [],
   "source": [
    "one_hot = pd.get_dummies(dataset)\n",
    "\n",
    "mca_no_one_hot = prince.MCA(one_hot=False)\n",
    "mca_no_one_hot = mca_no_one_hot.fit(one_hot)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Eigenvalues"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.042434Z",
     "iopub.status.busy": "2024-09-07T18:18:05.042154Z",
     "iopub.status.idle": "2024-09-07T18:18:05.061769Z",
     "shell.execute_reply": "2024-09-07T18:18:05.061044Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>eigenvalue</th>\n",
       "      <th>% of variance</th>\n",
       "      <th>% of variance (cumulative)</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>component</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.402</td>\n",
       "      <td>40.17%</td>\n",
       "      <td>40.17%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.211</td>\n",
       "      <td>21.11%</td>\n",
       "      <td>61.28%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.186</td>\n",
       "      <td>18.56%</td>\n",
       "      <td>79.84%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          eigenvalue % of variance % of variance (cumulative)\n",
       "component                                                    \n",
       "0              0.402        40.17%                     40.17%\n",
       "1              0.211        21.11%                     61.28%\n",
       "2              0.186        18.56%                     79.84%"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mca.eigenvalues_summary"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Coordinates"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.067678Z",
     "iopub.status.busy": "2024-09-07T18:18:05.067374Z",
     "iopub.status.idle": "2024-09-07T18:18:05.089954Z",
     "shell.execute_reply": "2024-09-07T18:18:05.089194Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.705387</td>\n",
       "      <td>9.509676e-15</td>\n",
       "      <td>0.758639</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-0.386586</td>\n",
       "      <td>7.593937e-15</td>\n",
       "      <td>0.626063</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>-0.386586</td>\n",
       "      <td>6.106546e-15</td>\n",
       "      <td>0.626063</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-0.852014</td>\n",
       "      <td>5.547435e-15</td>\n",
       "      <td>0.562447</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.783539</td>\n",
       "      <td>-6.333333e-01</td>\n",
       "      <td>0.130201</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          0             1         2\n",
       "0  0.705387  9.509676e-15  0.758639\n",
       "1 -0.386586  7.593937e-15  0.626063\n",
       "2 -0.386586  6.106546e-15  0.626063\n",
       "3 -0.852014  5.547435e-15  0.562447\n",
       "4  0.783539 -6.333333e-01  0.130201"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mca.row_coordinates(dataset).head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.095132Z",
     "iopub.status.busy": "2024-09-07T18:18:05.094838Z",
     "iopub.status.idle": "2024-09-07T18:18:05.122819Z",
     "shell.execute_reply": "2024-09-07T18:18:05.122131Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Color_PURPLE</th>\n",
       "      <td>0.117308</td>\n",
       "      <td>6.892024e-01</td>\n",
       "      <td>-0.641270</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Color_YELLOW</th>\n",
       "      <td>-0.130342</td>\n",
       "      <td>-7.657805e-01</td>\n",
       "      <td>0.712523</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Size_LARGE</th>\n",
       "      <td>0.117308</td>\n",
       "      <td>-6.892024e-01</td>\n",
       "      <td>-0.641270</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Size_SMALL</th>\n",
       "      <td>-0.130342</td>\n",
       "      <td>7.657805e-01</td>\n",
       "      <td>0.712523</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Action_DIP</th>\n",
       "      <td>-0.853864</td>\n",
       "      <td>-2.712409e-15</td>\n",
       "      <td>-0.079340</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     0             1         2\n",
       "Color_PURPLE  0.117308  6.892024e-01 -0.641270\n",
       "Color_YELLOW -0.130342 -7.657805e-01  0.712523\n",
       "Size_LARGE    0.117308 -6.892024e-01 -0.641270\n",
       "Size_SMALL   -0.130342  7.657805e-01  0.712523\n",
       "Action_DIP   -0.853864 -2.712409e-15 -0.079340"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mca.column_coordinates(dataset).head()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.127742Z",
     "iopub.status.busy": "2024-09-07T18:18:05.127386Z",
     "iopub.status.idle": "2024-09-07T18:18:05.211209Z",
     "shell.execute_reply": "2024-09-07T18:18:05.209166Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<div id=\"altair-viz-c47c977bfce242a99fcc53f51f97e581\"></div>\n",
       "<script type=\"text/javascript\">\n",
       "  var VEGA_DEBUG = (typeof VEGA_DEBUG == \"undefined\") ? {} : VEGA_DEBUG;\n",
       "  (function(spec, embedOpt){\n",
       "    let outputDiv = document.currentScript.previousElementSibling;\n",
       "    if (outputDiv.id !== \"altair-viz-c47c977bfce242a99fcc53f51f97e581\") {\n",
       "      outputDiv = document.getElementById(\"altair-viz-c47c977bfce242a99fcc53f51f97e581\");\n",
       "    }\n",
       "    const paths = {\n",
       "      \"vega\": \"https://cdn.jsdelivr.net/npm//vega@5?noext\",\n",
       "      \"vega-lib\": \"https://cdn.jsdelivr.net/npm//vega-lib?noext\",\n",
       "      \"vega-lite\": \"https://cdn.jsdelivr.net/npm//vega-lite@4.17.0?noext\",\n",
       "      \"vega-embed\": \"https://cdn.jsdelivr.net/npm//vega-embed@6?noext\",\n",
       "    };\n",
       "\n",
       "    function maybeLoadScript(lib, version) {\n",
       "      var key = `${lib.replace(\"-\", \"\")}_version`;\n",
       "      return (VEGA_DEBUG[key] == version) ?\n",
       "        Promise.resolve(paths[lib]) :\n",
       "        new Promise(function(resolve, reject) {\n",
       "          var s = document.createElement('script');\n",
       "          document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
       "          s.async = true;\n",
       "          s.onload = () => {\n",
       "            VEGA_DEBUG[key] = version;\n",
       "            return resolve(paths[lib]);\n",
       "          };\n",
       "          s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
       "          s.src = paths[lib];\n",
       "        });\n",
       "    }\n",
       "\n",
       "    function showError(err) {\n",
       "      outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
       "      throw err;\n",
       "    }\n",
       "\n",
       "    function displayChart(vegaEmbed) {\n",
       "      vegaEmbed(outputDiv, spec, embedOpt)\n",
       "        .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
       "    }\n",
       "\n",
       "    if(typeof define === \"function\" && define.amd) {\n",
       "      requirejs.config({paths});\n",
       "      require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
       "    } else {\n",
       "      maybeLoadScript(\"vega\", \"5\")\n",
       "        .then(() => maybeLoadScript(\"vega-lite\", \"4.17.0\"))\n",
       "        .then(() => maybeLoadScript(\"vega-embed\", \"6\"))\n",
       "        .catch(showError)\n",
       "        .then(() => displayChart(vegaEmbed));\n",
       "    }\n",
       "  })({\"config\": {\"view\": {\"continuousWidth\": 400, \"continuousHeight\": 300}}, \"layer\": [{\"data\": {\"name\": \"data-be5a09dde4373fbe3630bbef94f971d3\"}, \"mark\": {\"type\": \"circle\", \"size\": 50}, \"encoding\": {\"color\": {\"field\": \"variable\", \"type\": \"nominal\"}, \"tooltip\": [{\"field\": \"variable\", \"type\": \"nominal\"}, {\"field\": \"value\", \"type\": \"nominal\"}, {\"field\": \"component 0\", \"type\": \"quantitative\"}, {\"field\": \"component 1\", \"type\": \"quantitative\"}], \"x\": {\"axis\": {\"title\": \"component 0 \\u2014 40.17%\"}, \"field\": \"component 0\", \"scale\": {\"zero\": false}, \"type\": \"quantitative\"}, \"y\": {\"axis\": {\"title\": \"component 1 \\u2014 21.11%\"}, \"field\": \"component 1\", \"scale\": {\"zero\": false}, \"type\": \"quantitative\"}}, \"selection\": {\"selector001\": {\"type\": \"interval\", \"bind\": \"scales\", \"encodings\": [\"x\", \"y\"]}}}, {\"data\": {\"name\": \"data-5791ae432af1b420fffc946280282782\"}, \"mark\": {\"type\": \"circle\", \"size\": 50}, \"encoding\": {\"color\": {\"field\": \"variable\", \"type\": \"nominal\"}, \"tooltip\": [{\"field\": \"variable\", \"type\": \"nominal\"}, {\"field\": \"value\", \"type\": \"nominal\"}, {\"field\": \"component 0\", \"type\": \"quantitative\"}, {\"field\": \"component 1\", \"type\": \"quantitative\"}], \"x\": {\"axis\": {\"title\": \"component 0 \\u2014 40.17%\"}, \"field\": \"component 0\", \"scale\": {\"zero\": false}, \"type\": \"quantitative\"}, \"y\": {\"axis\": {\"title\": \"component 1 \\u2014 21.11%\"}, \"field\": \"component 1\", \"scale\": {\"zero\": false}, \"type\": \"quantitative\"}}}], \"$schema\": \"https://vega.github.io/schema/vega-lite/v4.17.0.json\", \"datasets\": {\"data-be5a09dde4373fbe3630bbef94f971d3\": [{\"component 0\": 0.7053867996248335, \"component 1\": 9.509676301036373e-15, \"component 2\": 0.758639110569096, \"variable\": \"row\", \"value\": \"0\", \"label\": 0}, {\"component 0\": -0.38658629949599, \"component 1\": 7.593936874893582e-15, \"component 2\": 0.6260630816840033, \"variable\": \"row\", \"value\": \"1\", \"label\": 1}, {\"component 0\": -0.3865862994959898, \"component 1\": 6.106546176142449e-15, \"component 2\": 0.6260630816840016, \"variable\": \"row\", \"value\": \"2\", \"label\": 2}, {\"component 0\": -0.8520140574664051, \"component 1\": 5.547434840374113e-15, \"component 2\": 0.5624474892356499, \"variable\": \"row\", \"value\": \"3\", \"label\": 3}, {\"component 0\": 0.7835387510478193, \"component 1\": -0.6333333333333306, \"component 2\": 0.13020069134918802, \"variable\": \"row\", \"value\": \"4\", \"label\": 4}, {\"component 0\": 0.7835387510478193, \"component 1\": -0.6333333333333306, \"component 2\": 0.13020069134918802, \"variable\": \"row\", \"value\": \"5\", \"label\": 5}, {\"component 0\": -0.3084343480730042, \"component 1\": -0.6333333333333326, \"component 2\": -0.0023753375359048617, \"variable\": \"row\", \"value\": \"6\", \"label\": 6}, {\"component 0\": -0.308434348073004, \"component 1\": -0.6333333333333342, \"component 2\": -0.0023753375359064507, \"variable\": \"row\", \"value\": \"7\", \"label\": 7}, {\"component 0\": -0.7738621060434193, \"component 1\": -0.6333333333333347, \"component 2\": -0.0659909299842582, \"variable\": \"row\", \"value\": \"8\", \"label\": 8}, {\"component 0\": 0.783538751047818, \"component 1\": 0.6333333333333356, \"component 2\": 0.13020069134917486, \"variable\": \"row\", \"value\": \"9\", \"label\": 9}, {\"component 0\": 0.783538751047818, \"component 1\": 0.6333333333333356, \"component 2\": 0.13020069134917486, \"variable\": \"row\", \"value\": \"10\", \"label\": 10}, {\"component 0\": -0.30843434807300557, \"component 1\": 0.6333333333333336, \"component 2\": -0.0023753375359180247, \"variable\": \"row\", \"value\": \"11\", \"label\": 11}, {\"component 0\": -0.3084343480730054, \"component 1\": 0.6333333333333321, \"component 2\": -0.0023753375359196137, \"variable\": \"row\", \"value\": \"12\", \"label\": 12}, {\"component 0\": -0.7738621060434207, \"component 1\": 0.6333333333333315, \"component 2\": -0.06599092998427136, \"variable\": \"row\", \"value\": \"13\", \"label\": 13}, {\"component 0\": 0.8616907024708037, \"component 1\": -4.6391761163119046e-15, \"component 2\": -0.49823772787073317, \"variable\": \"row\", \"value\": \"14\", \"label\": 14}, {\"component 0\": 0.8616907024708037, \"component 1\": -4.6391761163119046e-15, \"component 2\": -0.49823772787073317, \"variable\": \"row\", \"value\": \"15\", \"label\": 15}, {\"component 0\": -0.23028239665001982, \"component 1\": -6.554915542454694e-15, \"component 2\": -0.630813756755826, \"variable\": \"row\", \"value\": \"16\", \"label\": 16}, {\"component 0\": -0.23028239665001965, \"component 1\": -8.042306241205825e-15, \"component 2\": -0.6308137567558276, \"variable\": \"row\", \"value\": \"17\", \"label\": 17}, {\"component 0\": -0.6957101546204348, \"component 1\": -8.601417576974163e-15, \"component 2\": -0.6944293492041793, \"variable\": \"row\", \"value\": \"18\", \"label\": 18}], \"data-5791ae432af1b420fffc946280282782\": [{\"component 0\": 0.11730760677191375, \"component 1\": 0.6892024376045053, \"component 2\": -0.6412704755837072, \"variable\": \"column\", \"value\": \"Color_PURPLE\", \"label\": \"Color_PURPLE\"}, {\"component 0\": -0.13034178530212623, \"component 1\": -0.7657804862272279, \"component 2\": 0.7125227506485634, \"variable\": \"column\", \"value\": \"Color_YELLOW\", \"label\": \"Color_YELLOW\"}, {\"component 0\": 0.1173076067719153, \"component 1\": -0.6892024376045175, \"component 2\": -0.6412704755836937, \"variable\": \"column\", \"value\": \"Size_LARGE\", \"label\": \"Size_LARGE\"}, {\"component 0\": -0.1303417853021279, \"component 1\": 0.7657804862272416, \"component 2\": 0.7125227506485483, \"variable\": \"column\", \"value\": \"Size_SMALL\", \"label\": \"Size_SMALL\"}, {\"component 0\": -0.8538641988881542, \"component 1\": -2.712409382769673e-15, \"component 2\": -0.07934001340795455, \"variable\": \"column\", \"value\": \"Action_DIP\", \"label\": \"Action_DIP\"}, {\"component 0\": 0.6209921446459306, \"component 1\": 1.984320188510338e-15, \"component 2\": 0.05770182793305772, \"variable\": \"column\", \"value\": \"Action_STRETCH\", \"label\": \"Action_STRETCH\"}, {\"component 0\": 0.6209921446459307, \"component 1\": 5.610433036335892e-16, \"component 2\": 0.05770182793305635, \"variable\": \"column\", \"value\": \"Age_ADULT\", \"label\": \"Age_ADULT\"}, {\"component 0\": -0.8538641988881546, \"component 1\": -7.699842392032169e-16, \"component 2\": -0.07934001340795278, \"variable\": \"column\", \"value\": \"Age_CHILD\", \"label\": \"Age_CHILD\"}, {\"component 0\": -0.7314664035372919, \"component 1\": -1.2700275846653781e-15, \"component 2\": -0.05473108398079316, \"variable\": \"column\", \"value\": \"Inflated_F\", \"label\": \"Inflated_F\"}, {\"component 0\": 1.2539424060639295, \"component 1\": 2.2457640234049206e-15, \"component 2\": 0.09382471539564527, \"variable\": \"column\", \"value\": \"Inflated_T\", \"label\": \"Inflated_T\"}]}}, {\"mode\": \"vega-lite\"});\n",
       "</script>"
      ],
      "text/plain": [
       "alt.LayerChart(...)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mca.plot(\n",
    "    dataset,\n",
    "    x_component=0,\n",
    "    y_component=1,\n",
    "    show_column_markers=True,\n",
    "    show_row_markers=True,\n",
    "    show_column_labels=False,\n",
    "    show_row_labels=False\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Contributions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.215554Z",
     "iopub.status.busy": "2024-09-07T18:18:05.214851Z",
     "iopub.status.idle": "2024-09-07T18:18:05.244981Z",
     "shell.execute_reply": "2024-09-07T18:18:05.243455Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "</style>\n",
       "<table id=\"T_e4b1a\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_e4b1a_level0_col0\" class=\"col_heading level0 col0\" >0</th>\n",
       "      <th id=\"T_e4b1a_level0_col1\" class=\"col_heading level0 col1\" >1</th>\n",
       "      <th id=\"T_e4b1a_level0_col2\" class=\"col_heading level0 col2\" >2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_e4b1a_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
       "      <td id=\"T_e4b1a_row0_col0\" class=\"data row0 col0\" >7%</td>\n",
       "      <td id=\"T_e4b1a_row0_col1\" class=\"data row0 col1\" >0%</td>\n",
       "      <td id=\"T_e4b1a_row0_col2\" class=\"data row0 col2\" >16%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e4b1a_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
       "      <td id=\"T_e4b1a_row1_col0\" class=\"data row1 col0\" >2%</td>\n",
       "      <td id=\"T_e4b1a_row1_col1\" class=\"data row1 col1\" >0%</td>\n",
       "      <td id=\"T_e4b1a_row1_col2\" class=\"data row1 col2\" >11%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e4b1a_level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
       "      <td id=\"T_e4b1a_row2_col0\" class=\"data row2 col0\" >2%</td>\n",
       "      <td id=\"T_e4b1a_row2_col1\" class=\"data row2 col1\" >0%</td>\n",
       "      <td id=\"T_e4b1a_row2_col2\" class=\"data row2 col2\" >11%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e4b1a_level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
       "      <td id=\"T_e4b1a_row3_col0\" class=\"data row3 col0\" >10%</td>\n",
       "      <td id=\"T_e4b1a_row3_col1\" class=\"data row3 col1\" >0%</td>\n",
       "      <td id=\"T_e4b1a_row3_col2\" class=\"data row3 col2\" >9%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e4b1a_level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
       "      <td id=\"T_e4b1a_row4_col0\" class=\"data row4 col0\" >8%</td>\n",
       "      <td id=\"T_e4b1a_row4_col1\" class=\"data row4 col1\" >10%</td>\n",
       "      <td id=\"T_e4b1a_row4_col2\" class=\"data row4 col2\" >0%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x14745b7d0>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mca.row_contributions_.head().style.format('{:.0%}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.251107Z",
     "iopub.status.busy": "2024-09-07T18:18:05.249993Z",
     "iopub.status.idle": "2024-09-07T18:18:05.270271Z",
     "shell.execute_reply": "2024-09-07T18:18:05.268359Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "</style>\n",
       "<table id=\"T_f1141\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_f1141_level0_col0\" class=\"col_heading level0 col0\" >0</th>\n",
       "      <th id=\"T_f1141_level0_col1\" class=\"col_heading level0 col1\" >1</th>\n",
       "      <th id=\"T_f1141_level0_col2\" class=\"col_heading level0 col2\" >2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_f1141_level0_row0\" class=\"row_heading level0 row0\" >Color_PURPLE</th>\n",
       "      <td id=\"T_f1141_row0_col0\" class=\"data row0 col0\" >0%</td>\n",
       "      <td id=\"T_f1141_row0_col1\" class=\"data row0 col1\" >24%</td>\n",
       "      <td id=\"T_f1141_row0_col2\" class=\"data row0 col2\" >23%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_f1141_level0_row1\" class=\"row_heading level0 row1\" >Color_YELLOW</th>\n",
       "      <td id=\"T_f1141_row1_col0\" class=\"data row1 col0\" >0%</td>\n",
       "      <td id=\"T_f1141_row1_col1\" class=\"data row1 col1\" >26%</td>\n",
       "      <td id=\"T_f1141_row1_col2\" class=\"data row1 col2\" >26%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_f1141_level0_row2\" class=\"row_heading level0 row2\" >Size_LARGE</th>\n",
       "      <td id=\"T_f1141_row2_col0\" class=\"data row2 col0\" >0%</td>\n",
       "      <td id=\"T_f1141_row2_col1\" class=\"data row2 col1\" >24%</td>\n",
       "      <td id=\"T_f1141_row2_col2\" class=\"data row2 col2\" >23%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_f1141_level0_row3\" class=\"row_heading level0 row3\" >Size_SMALL</th>\n",
       "      <td id=\"T_f1141_row3_col0\" class=\"data row3 col0\" >0%</td>\n",
       "      <td id=\"T_f1141_row3_col1\" class=\"data row3 col1\" >26%</td>\n",
       "      <td id=\"T_f1141_row3_col2\" class=\"data row3 col2\" >26%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_f1141_level0_row4\" class=\"row_heading level0 row4\" >Action_DIP</th>\n",
       "      <td id=\"T_f1141_row4_col0\" class=\"data row4 col0\" >15%</td>\n",
       "      <td id=\"T_f1141_row4_col1\" class=\"data row4 col1\" >0%</td>\n",
       "      <td id=\"T_f1141_row4_col2\" class=\"data row4 col2\" >0%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x147429950>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mca.column_contributions_.head().style.format('{:.0%}')"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cosine similarities"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.274947Z",
     "iopub.status.busy": "2024-09-07T18:18:05.274461Z",
     "iopub.status.idle": "2024-09-07T18:18:05.303366Z",
     "shell.execute_reply": "2024-09-07T18:18:05.302997Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.461478</td>\n",
       "      <td>8.387409e-29</td>\n",
       "      <td>0.533786</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.152256</td>\n",
       "      <td>5.875091e-29</td>\n",
       "      <td>0.399316</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.152256</td>\n",
       "      <td>3.799023e-29</td>\n",
       "      <td>0.399316</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.653335</td>\n",
       "      <td>2.769663e-29</td>\n",
       "      <td>0.284712</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.592606</td>\n",
       "      <td>3.871772e-01</td>\n",
       "      <td>0.016363</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          0             1         2\n",
       "0  0.461478  8.387409e-29  0.533786\n",
       "1  0.152256  5.875091e-29  0.399316\n",
       "2  0.152256  3.799023e-29  0.399316\n",
       "3  0.653335  2.769663e-29  0.284712\n",
       "4  0.592606  3.871772e-01  0.016363"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mca.row_cosine_similarities(dataset).head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-09-07T18:18:05.304890Z",
     "iopub.status.busy": "2024-09-07T18:18:05.304809Z",
     "iopub.status.idle": "2024-09-07T18:18:05.319101Z",
     "shell.execute_reply": "2024-09-07T18:18:05.318834Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Color_PURPLE</th>\n",
       "      <td>0.015290</td>\n",
       "      <td>5.277778e-01</td>\n",
       "      <td>0.456920</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Color_YELLOW</th>\n",
       "      <td>0.015290</td>\n",
       "      <td>5.277778e-01</td>\n",
       "      <td>0.456920</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Size_LARGE</th>\n",
       "      <td>0.015290</td>\n",
       "      <td>5.277778e-01</td>\n",
       "      <td>0.456920</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Size_SMALL</th>\n",
       "      <td>0.015290</td>\n",
       "      <td>5.277778e-01</td>\n",
       "      <td>0.456920</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Action_DIP</th>\n",
       "      <td>0.530243</td>\n",
       "      <td>5.350665e-30</td>\n",
       "      <td>0.004578</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     0             1         2\n",
       "Color_PURPLE  0.015290  5.277778e-01  0.456920\n",
       "Color_YELLOW  0.015290  5.277778e-01  0.456920\n",
       "Size_LARGE    0.015290  5.277778e-01  0.456920\n",
       "Size_SMALL    0.015290  5.277778e-01  0.456920\n",
       "Action_DIP    0.530243  5.350665e-30  0.004578"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mca.column_cosine_similarities(dataset).head()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  },
  "vscode": {
   "interpreter": {
    "hash": "441c2ec70d9faeb70e7723f55150c6260f4a26a9c828b90915d3399002e14f43"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
